Clause Aggregation Using Linguistic Knowledge
Abstract
By combining multiple clauses into a single sentence, a text generation system can express the same amount of information in fewer words and, at the same time, produce a great variety of complex constructions. In this paper, we describe hypotactic and paratactic operators for generating complex sentences from clause-sized semantic representations. These two types of operators are portable and reusable because they are based on general resources such as the lexicon and the grammar.

1 Introduction

An expression is more concise than another expression if it conveys the same amount of information in fewer words. Complex sentences generated by combining clauses are more concise than the corresponding simple sentences because multiple references to recurring entities are removed. For example, clauses like "Jones is a patient" and "Jones has hypertension" can be combined into a more concise sentence, "Jones is a hypertensive patient."

To illustrate the common occurrence of such repeated entities in generation, let us take a shipping company's database as an example. Each database tuple being conveyed is transformed into one or more propositions or clauses (we use these terms interchangeably throughout the paper). Each proposition refers to a piece of information which usually corresponds to a simple sentence. The database might contain multiple shipments to the same location, possibly on the same day. Generating a sentence for each tuple separately would yield repetitive and potentially redundant references to the same location or date. Though we used a relational database as the example, the observation about recurring entities in the input is also valid for other types of input, such as execution traces from expert systems.

CASPER (Clause Aggregation in Sentence PlannER) is a sentence planner which focuses on generating concise sentences. Clause aggregation can happen at three levels: inferential, rhetorical, and linguistic.
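As a rough illustration of the shipping-database observation above, the following Python sketch groups tuples that share a destination and a date and coordinates the remaining constituents into one sentence. It is not CASPER's algorithm: the tuple layout, the sentence template, and the function names (`join_items`, `one_sentence_per_tuple`, `aggregate`) are all assumptions made for this example.

```python
def join_items(items):
    """Coordinate noun phrases: 'a', 'a and b', or 'a, b, and c'."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def one_sentence_per_tuple(shipments):
    """Baseline: one simple sentence per (item, destination, day) tuple."""
    return [
        f"The company shipped {item} to {dest} on {day}."
        for item, dest, day in shipments
    ]


def aggregate(shipments):
    """Merge tuples sharing destination and day into a single sentence."""
    groups = {}  # dicts keep insertion order in Python 3.7+
    for item, dest, day in shipments:
        groups.setdefault((dest, day), []).append(item)
    return [
        f"The company shipped {join_items(items)} to {dest} on {day}."
        for (dest, day), items in groups.items()
    ]


# Hypothetical database tuples, invented for illustration only.
shipments = [
    ("steel", "Boston", "Monday"),
    ("lumber", "Boston", "Monday"),
    ("grain", "Chicago", "Tuesday"),
]

for sentence in aggregate(shipments):
    print(sentence)
```

With these three hypothetical tuples, the two Boston shipments collapse into one sentence with a conjoined object, so the repeated references to the location and the date appear only once.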
At the inferential level, user modeling, domain knowledge, and common sense reasoning are used to reduce the number of concepts to convey. Such operations are implemented in the content planner, and clauses are combined without consulting lexical resources. Text summarization is an application which uses inferential operators extensively. For example, the two sentences "John hit Mary" and "Mary kicked John" might imply that "John and Mary fought." Defining a set of inferential operators for unrestricted text is beyond the state of the art. Because it is unlikely that the inferential operators for our domains (medical briefings and telephone network plan descriptions) will be reusable for other applications, we have directed our effort into aggregation operations at the other levels.

At the rhetorical level, clauses are combined based on their rhetorical relationships [Mann and Thompson, 1986], such as CONTRAST and CONDITION. We will take advantage of such information in future aggregation work.

At the linguistic level, lexical and syntactic information is used to combine clauses. In this paper, we concentrate on two types